Analytical Mean Squared Error Curves in Temporal Difference Learning

نویسندگان

  • Satinder P. Singh
  • Peter Dayan
چکیده

Peter Dayan Brain and Cognitive Sciences E25-210, MIT Cambridge, MA 02139 [email protected] We have calculated analytical expressions for how the bias and variance of the estimators provided by various temporal difference value estimation algorithms change with offline updates over trials in absorbing Markov chains using lookup table representations. We illustrate classes of learning curve behavior in various chains, and show the manner in which TD is sensitive to the choice of its stepsize and eligibility trace parameters.

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Analytical Mean Squared Error Curves in Temporal Di erence Learning

We have calculated analytical expressions for how the bias and variance of the estimators provided by various temporal di erence value estimation algorithms change with o ine updates over trials in absorbing Markov chains using lookup table representations. We illustrate classes of learning curve behavior in various chains, and show the manner in which TD is sensitive to the choice of its steps...

متن کامل

Analytical Mean Squared Error Curves in Temporal Diierence Learning

We have calculated analytical expressions for how the bias and variance of the estimators provided by various temporal diierence value estimation algorithms change with ooine updates over trials in absorbing Markov chains using lookup table representations. We illustrate classes of learning curve behavior in various chains, and show the manner in which TD is sensitive to the choice of its step-...

متن کامل

Evaluation of remote sensing indicators in drought monitoring using machine learning algorithms (Case study: Marivan city)

Remote sensing indices are used to analyze the Spatio-temporal distribution of drought conditions and to identify the severity of drought. This study, using various drought indices generated from Madis and TRMM satellite data extracted from Google Earth Engine (GEE) platform. Drought conditions in Marivan city from February to November for the years 2001 to 2017 were analyzed based on spatial a...

متن کامل

Using Machine Learning ARIMA to Predict the Price of Cryptocurrencies

The increasing volatility in pricing and growing potential for profit in digital currency have made predicting the price of cryptocurrency a very attractive research topic. Several studies have already been conducted using various machine-learning models to predict crypto currency prices. This study presented in this paper applied a classic Autoregressive Integrated Moving Average(ARIMA) model ...

متن کامل

Linear Stochastic Approximation: Constant Step-Size and Iterate Averaging

We consider d-dimensional linear stochastic approximation algorithms (LSAs) with a constant step-size and the so called Polyak-Ruppert (PR) averaging of iterates. LSAs are widely applied in machine learning and reinforcement learning (RL), where the aim is to compute an appropriate θ∗ ∈ R (that is an optimum or a fixed point) using noisy data and O(d) updates per iteration. In this paper, we ar...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 1996